PERFORMANCE EVALUATION OF MULTIPROCESSOR SYSTEMS
CONTAINING SPECIAL PURPOSE PROCESSORS


Jane W. S. Liu
Chung L. Liu




Department of Computer Science
University of Illinois
Urbana, Illinois




Nabuyoshi Miyazaki
Haruaki Yamazaki




OKI  Electric  Industry  Company,  Ltd. 
Tokyo,  Japan 


This work was partially supported by the Office of Naval Research
under Contract No. N00014-79-C-0775.


DISTRIBUTION STATEMENT A
Approved for public release;
Distribution Unlimited


Abstract 


The relative merits of two different types of multiprocessor
systems are compared in terms of their effective processing capabilities.
These two types of multiprocessors are (i) one which contains
general purpose processors and (ii) the other which contains special
purpose processors. A deterministic model and queueing theoretical
models of these systems are described. The potential performance
improvement by multitasking is discussed in terms of the number of
processors and the degree of concurrency in jobs.


1.  Introduction 


In recent years, progress in hardware technology and system architecture
has made the design and implementation of large and complex multiprocessor
systems possible. By a multiprocessor system, we mean specifically
here a computer system which contains two or more closely coupled processors
and is in the category of MIMD (Multiple Instruction Streams, Multiple
Data Streams) systems. The architectural differences among these systems can
be characterized in many different ways [1]. Here, we classify multiprocessor
systems according to the types of processors in the system. Some
multiprocessor systems consist of identical general purpose processors which
share the input job load. Examples of this type of multiprocessor system
include most of the well known multiprocessors and closely-coupled computer
networks such as the IBM 360/65 or 370 MP, Burroughs B 5500 [2], C.mmp [3] and
PRIME [4]. Other multiprocessor systems contain special purpose (or
functionally dedicated) processors, each of which is designed or programmed
to perform efficiently a particular type of function. The types of special
purpose processors which have received a great deal of attention in recent years
include front-end communication processors designed to deal with input and
output of low speed data and line control procedures, back-end processors
(or computers) designed to relieve the host system of tasks involved in the
management of data bases, and array processors, intelligent graphics terminals,
sort-merge processors, etc., designed to perform special functions with speeds
normally unachievable using general purpose hardware. As a matter of fact,
one commonly used technique to capitalize on the cost/performance potential
of VLSI components is to build powerful special purpose processors and use
them as attached processors to existing computing systems. Thus, certain
functions may be off-loaded for more efficient execution.


Clearly, for a multiprocessor system containing special purpose processors
to have comparable cost/performance characteristics, it must have
some architectural merits (such as fast processor speed, more reliable
communication paths, etc.) to compensate for the potential lack of such
desirable features as fail-softness, expandability, maintainability, etc.,
provided through redundancy in a system containing general purpose processors.


It is difficult to compare the relative merits of the two types of
multiprocessor systems in terms of these criteria in general. In this paper,
we are concerned with their relative merits when they are compared in
terms of their effective processing capabilities. Using a deterministic
model and several approximate queueing theoretical models of multiprocessor
systems, we compare their relative performance under various measures
of effectiveness.

In Section II, a general deterministic model of multiprocessor systems
is described. In this model, each type of special purpose processor
is further divided into subtypes with a partial ordering relation defined
over the processor subtypes. Thus multiprocessor systems in which some
processors are functionally identical but have dedicated memories of different
sizes can also be modelled. This model is used to obtain a worst case
bound on the performance of priority driven scheduling algorithms.

To  study  the  performance  of  the  two  types  of  multiprocessor  systems 
from  another  point  of  view,  these  systems  are  modelled  using  approximate 
queueing  theoretical  models  in  Section  III. 

A closely related problem is the potential performance improvement
in multiprocessor systems achievable by multitasking. While multitasking
systems can be effectively modelled with our deterministic model, the
queueing models in Section III can only be used for multiprogrammed systems.
We discuss in Section IV a special queueing model of multitasking systems
and evaluate the potential gain in processing capability achievable by
multitasking.


Section V summarizes our conclusions from the results obtained in the
previous sections.


Let $\mathcal{T} = \{T_1, T_2, \ldots, T_n\}$ be a set of tasks to be executed on a
system $\mathcal{S}$. A task is said to be a type $(j,k)$ task if it can be
executed on any type $(j,v)$ processor for $(j,k) \le (j,v)$ and on no other
type of processor. Let $\alpha$ be a function from $\mathcal{T}$ to the processor types
$(j,k)$ so that $\alpha(T_i)$ specifies the type of task $T_i$. We denote the time
required to complete a task $T_i$ (on a type $(j,k)$ processor) by $\mu(T_i)$,
where $\mu$ is a function from $\mathcal{T}$ to the reals. $\mu(T_i)$ shall be referred
to as the execution time of $T_i$.

We suppose that there is a precedence relation $<'$ defined over the
set $\mathcal{T}$. That $T_i$ precedes $T_j$ (or $T_j$ follows $T_i$) is written as
$T_i <' T_j$ and means that the execution of $T_j$ cannot begin before the
execution of $T_i$ is completed. A task is said to be executable at a certain
time if the tasks preceding it have been completed. Formally, a set of tasks is
represented by an ordered quadruple $(\mathcal{T}, \alpha, \mu, <')$. We use the notation
$(T_i, \alpha(T_i), \mu(T_i))$ to represent a particular task $T_i$.

Consider all type $(j,k)$ processors for a fixed $j$. The smallest subset
of types $\{(j,k_1), (j,k_2), \ldots, (j,k_q)\}$ is said to be the dominating set
if for any type $(j,k)$, $(j,k) \le (j,k_p)$ for some $1 \le p \le q$. In other words,
any type $(j,k)$ task can be executed on a processor whose type belongs to
the dominating set. For example, in the multiprocessor system shown in
Figure 1, $\{(1,1)\}$, $\{(2,1)\}$ and $\{(3,1),(3,2)\}$ are dominating sets. We also
refer to a type $(j,k)$ as a maximal type if it is in the dominating set.

For a given dominating set, let

$$m_{j0} = \min \{\, m_{jk_p} \mid (j,k_p) \text{ is in the dominating set} \,\}.$$

In other words, $m_{j0}$ is the minimum of the numbers of processors
among all types of processors in the dominating set. Therefore, $m_{j0}$ is
equal to 3, 1, and 1 for $j = 1, 2,$ and $3$, respectively, in our example.
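As an illustration of these definitions, the following minimal Python sketch computes the maximal subtypes (the dominating set) and $m_{j0}$ for a single processor type $j$. The data representation is an assumption made for this example only: the partial order is passed as an explicit set of pairs `(k, v)` meaning $(j,k) \le (j,v)$ with $k \ne v$, assumed to be transitively closed, and the subtype labels and processor counts are hypothetical (they are not those of Figure 1).

```python
def dominating_set(subtypes, below):
    """Return the maximal subtypes of one processor type j.

    subtypes: iterable of subtype labels k.
    below:    set of pairs (k, v), k != v, meaning (j,k) <= (j,v);
              assumed transitively closed (an assumption of this sketch).
    In a finite partial order the maximal elements form the smallest
    set that dominates every element.
    """
    subtypes = list(subtypes)
    return [k for k in subtypes
            if not any((k, v) in below for v in subtypes if v != k)]


def m_j0(counts, subtypes, below):
    """Minimum number of processors over the subtypes in the dominating set."""
    return min(counts[k] for k in dominating_set(subtypes, below))


# Hypothetical type with three subtypes: subtype 3 is dominated by both
# 1 and 2, while subtypes 1 and 2 are incomparable.
subtypes = [1, 2, 3]
below = {(3, 1), (3, 2)}
counts = {1: 1, 2: 2, 3: 4}   # hypothetical numbers of processors per subtype

print(dominating_set(subtypes, below))
print(m_j0(counts, subtypes, below))
```

Only the maximal subtypes enter $m_{j0}$; the four processors of the dominated subtype 3 in this hypothetical configuration do not affect it.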

We want to determine the performance of a class of scheduling algorithms
in which processors are never left idle intentionally. These
algorithms are known as priority driven scheduling algorithms and can
be described by the priorities assigned to the tasks [5].
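To make the notion of a priority driven schedule concrete, here is a minimal Python sketch for the simplest setting only: $m$ identical general purpose processors, with the priorities given as an ordered list. The function name, the task names and the execution times are hypothetical, and processor types and subtypes are deliberately left out of this sketch.

```python
import heapq


def list_schedule(exec_time, preds, priority, m):
    """Completion time of a priority-driven schedule on m identical processors.

    exec_time: dict task -> execution time.
    preds:     dict task -> iterable of tasks that must complete first
               (assumed to describe an acyclic precedence relation).
    priority:  list of all tasks, highest priority first.
    A processor is never left idle while an executable task is waiting.
    """
    pred_sets = {t: set(preds.get(t, ())) for t in exec_time}
    done, started = set(), set()
    running = []              # heap of (finish_time, task)
    now, free = 0.0, m
    while len(done) < len(exec_time):
        # Start executable tasks in priority order while processors are free.
        for t in priority:
            if free == 0:
                break
            if t not in started and pred_sets[t] <= done:
                heapq.heappush(running, (now + exec_time[t], t))
                started.add(t)
                free -= 1
        # Advance time to the next task completion.
        now, finished = heapq.heappop(running)
        done.add(finished)
        free += 1
    return now


exec_time = {"A": 1, "B": 1, "C": 2}     # hypothetical independent tasks
print(list_schedule(exec_time, {}, ["A", "B", "C"], 2))   # poor priority list
print(list_schedule(exec_time, {}, ["C", "A", "B"], 2))   # better priority list
```

With these three tasks on two processors, the two priority lists give completion times in the ratio 3/2, which is the value $2 - 1/m$ for $m = 2$.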


Let $\omega$ and $\omega'$ denote the completion times of a set of tasks $(\mathcal{T}, \alpha, \mu, <')$
when executed on a system $\mathcal{S}$ according to a priority-driven schedule and
an arbitrary schedule, respectively. The ratio of $\omega$ to $\omega'$ has the following
upper bound

$$\frac{\omega}{\omega'} \le 1 + \sum_{j=1}^{r} \frac{m_j}{m_{j0}} - \min_{1 \le j \le r} \frac{1}{m_{j0}} \tag{1}$$

To prove this inequality, let $t_j$ denote the total execution time of all
type $(j,k)$ tasks for $k = 1, 2, \ldots$ For a priority-driven schedule, let
$\phi$ be the total idle time in all processors. Hence the completion time of
the priority driven schedule is given by

$$\omega = \frac{1}{m}\Big(\sum_{j=1}^{r} t_j + \phi\Big).$$


Let $I$ be the sum of lengths of all idle periods during which at least
one processor of each maximal type is idle. Let $s_j$ be the sum of the
portions of execution times of all type $(j,k)$ tasks scheduled during these
periods. There is a chain of tasks in $\mathcal{T}$ such that during these
idle periods one of the other processors is executing a task in the chain.
Hence

$$I \le (m-1)\,\omega'.$$

Moreover,

$$\sum_{j=1}^{r} s_j \ge \frac{I}{m-1}. \tag{2}$$

Let $K_j$ denote the sum of lengths of idle periods during which all
the processors of one or more maximal types of the form $(j,\cdot)$ are busy.
We have

$$K_j \le \frac{m - m_{j0}}{m_{j0}}\,(t_j - s_j).$$


Combining this inequality with the inequality in (2), we obtain

$$\sum_{j=1}^{r} K_j \le \sum_{j=1}^{r} \frac{m - m_{j0}}{m_{j0}}\, t_j - \frac{I}{m-1}\Big(\min_{1 \le j \le r} \frac{m - m_{j0}}{m_{j0}}\Big).$$

Since $t_j/m_j \le \omega'$ for $j = 1, 2, \ldots, r$, $\sum_{j=1}^{r} t_j/m \le \omega'$, and

$$\phi \le I + \sum_{j=1}^{r} K_j,$$

we have

$$\omega \le \omega' \Big[ 1 + \sum_{j=1}^{r} \frac{m_j}{m_{j0}} - \frac{1}{m}\sum_{j=1}^{r} m_j + 1 - \min_{1 \le j \le r} \frac{1}{m_{j0}} \Big],$$

which reduces to the bound given by Equation (1), since $\sum_{j=1}^{r} m_j = m$. □

When the dominating sets contain only one subtype for all $j = 1, 2, \ldots, r$
(that is, there is a unique maximal subtype for all types), the bound given
by (1) is the best possible. This fact can be demonstrated by an example
which can be found in [5].

For a system containing $m$ identical general purpose processors, the
upper bound in (1) reduces to the well known result [7]

$$\frac{\omega}{\omega'} \le 2 - \frac{1}{m}.$$

On the other hand, for a job shop problem, $m_1 = m_2 = \cdots = m_r = 1$. In this
case, we have

$$\frac{\omega}{\omega'} \le r.$$


When the multiprocessor system contains only one type of processor,
we have $m_i = 0$ for $i = 2, 3, \ldots, r$. The bound given by (1) is simply

$$\frac{\omega}{\omega'} \le 1 + \frac{m_1}{m_{10}} - \frac{1}{m_{10}}.$$


The bound derived in [3] for processors with different storage capacities
is a special case of our result.
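The special cases above can be checked numerically. The sketch below assumes that the bound has the form $1 + \sum_j m_j/m_{j0} - \min_j 1/m_{j0}$ and evaluates it for a few configurations; the function name and the sample configurations are hypothetical.

```python
def worst_case_bound(m, m0):
    """Worst case ratio of a priority-driven schedule to an arbitrary schedule.

    m[j]:  total number of type j processors.
    m0[j]: smallest number of processors among the maximal subtypes of type j.
    Only processor types that are actually present should be listed.
    """
    if len(m) != len(m0) or not m:
        raise ValueError("m and m0 must be non-empty lists of equal length")
    return (1.0
            + sum(mj / mj0 for mj, mj0 in zip(m, m0))
            - min(1.0 / mj0 for mj0 in m0))


# m identical general purpose processors (one type, one subtype): 2 - 1/m.
print(worst_case_bound([4], [4]))
# Job shop with r = 3 types and one processor of each type: r.
print(worst_case_bound([1, 1, 1], [1, 1, 1]))
# Hypothetical single type with 5 processors, 2 in the smallest maximal subtype.
print(worst_case_bound([5], [2]))
```

The bound grows as $m_{j0}$ shrinks relative to $m_j$, that is, when only a few processors of a type can run every task of that type.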
3.  Queueing models of multiprocessor systems

To model a multiprocessor system probabilistically, we assume that
the arrival process of jobs requesting service of the system is Poisson
with parameter $\lambda$. Each job may be decomposed into a number of different
tasks. There are altogether $r$ different types of tasks. (For example,
consider a system in which some jobs are decomposed into an input task
followed by compilation, computation and output tasks while the other
jobs are decomposed into input, sorting and merging, and output tasks.
In this case, there are 5 different types of tasks.) We say that these
tasks are generated by the job. Let $N_i$ denote the number of tasks of
type $i$ and $N$ be the total number of tasks generated by a job. We assume
that the $N_i$'s are statistically independent random variables.

It is sufficient to consider the relative speeds of the processors.
We choose to measure the speeds of special purpose processors with respect
to the speed of a general purpose processor. In particular, we call
the relative speed of a special purpose processor with respect to a general
purpose processor the capacity of the special purpose processor. For
example, if a task takes $1/\mu$ units of time to be completed by a general
purpose processor, then it takes $(1/\mu)(1/C)$ units of time to be completed
by a processor with capacity $C$.

We refer to the time required to complete a task in a system as the
execution time (or service time) of the task in that system. In particular,
the execution time of the task on a general purpose processor is
called the amount of work for that task. Hence, if a special purpose
processor with capacity $C$ completes the given task within $t$ sec, then the
amount of work for that task is $t \cdot C$ units.


We measure the effectiveness of a multiprocessor system by the average
total amount of work remaining in the system in statistical equilibrium,
that is, the total time required for a general purpose processor to complete
all tasks being served and waiting for service in the system. This performance
measure is chosen since it does not depend on the queueing discipline
used to schedule the tasks in the system.


3.1.  Systems  with  independent  input  processes 

When the tasks generated by jobs are independent³, we approximate
the arrival processes of different types of tasks by independent Poisson
processes⁴. Let $\lambda_i$ denote the average arrival rate of tasks of type $i$.
A multiprocessor system containing $m$ general purpose processors can be
approximately modelled by the M/G/m queue shown in Figure 2a. In this
case, when all processors are busy, tasks join a common queue (for
example, as in the B 5500 [2]). Similarly, a multiprocessor system
containing $r$ types of special purpose processors can be modelled by the
multiserver system in Figure 2b. Let $m_i$ denote the number of type $i$
processors (i.e., those designed to execute tasks of type $i$ only).
These processors are referred to collectively as the $i$th subsystem.
A type $i$ task joins the $i$th queue upon its arrival.

It is difficult to analyze the general behavior of multiserver
queues since service times in our case are non-exponential. We consider
here several special cases.

3.1.1.  Systems with one processor of each type

For the case where $m_1 = m_2 = \cdots = m_r = m = 1$, expressions for the average
total amount of work remaining in both types of systems can be obtained easily
[9]. When the amount of work for type $i$ tasks is exponentially distributed
with parameter $\mu_i$ $(i = 1, 2, \ldots, r)$, the average total amount of work in the
$i$th subsystem with one special purpose processor of capacity $C_i$ is

$$W_{s_i} = \frac{1}{\mu_i}\,\frac{\rho_i}{1-\rho_i},$$

where $\rho_i = \lambda_i/(\mu_i C_i)$. Hence the average total amount of work remaining
in a system containing $r$ special purpose processors is given by

$$W_s = \sum_{i=1}^{r} \frac{1}{\mu_i}\,\frac{\rho_i}{1-\rho_i}. \tag{3}$$

³ In the sense that they can be executed simultaneously.

⁴ Multiprocessor systems containing identical processors may be modelled
more accurately by the M/M/m bulk arrival queueing system discussed in
Section IV.

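Let $W_g$ be the average total amount of work remaining in a system
containing a general purpose processor. Since the average execution time
of all tasks is equal to

$$\frac{1}{\lambda'} \sum_{i=1}^{r} \frac{\lambda_i}{\mu_i} \quad \text{and} \quad \lambda' = \sum_{i=1}^{r} \lambda_i,$$

we have

$$W_g = \frac{1}{1-\rho} \sum_{i=1}^{r} \frac{\lambda_i}{\mu_i^2}, \tag{4}$$

where $\rho = \sum_{i=1}^{r} \lambda_i/\mu_i$.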

To compare the performance of the two systems, we assume that the
capacities, $C_i$, of the special purpose processors are chosen to minimize
$W_s$, subject to the constraint $\sum_{i=1}^{r} C_i = C$. The values of $C_i$ that
minimize $W_s$ are given by

$$C_i = \frac{\lambda_i}{\mu_i} + (C - \rho)\,\frac{\sqrt{\lambda_i}/\mu_i}{\sum_{j=1}^{r} \sqrt{\lambda_j}/\mu_j}.$$

For these values of $C_i$, $W_s$ in Equation (3) becomes

$$W_s = \frac{\big(\sum_{i=1}^{r} \sqrt{\lambda_i}/\mu_i\big)^2}{C - \rho}.$$

Moreover, $W_s \le W_g$ as long as the total capacity of all special purpose
processors, $C$, is such that

$$C \ge (1-\rho)\,\frac{\big(\sum_{i=1}^{r} \sqrt{\lambda_i}/\mu_i\big)^2}{\sum_{i=1}^{r} \lambda_i/\mu_i^2} + \rho.$$

This result indicates that in order for a system containing special
purpose processors to be as effective as a system containing general
purpose processors, the special purpose processors must be made sufficiently fast.
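The comparison in this subsection can be explored numerically. The following Python sketch assumes the single-processor-per-type formulas $W_s = \sum_i (1/\mu_i)\,\rho_i/(1-\rho_i)$ with $\rho_i = \lambda_i/(\mu_i C_i)$ and $W_g = (1-\rho)^{-1}\sum_i \lambda_i/\mu_i^2$, and the capacity allocation obtained by minimizing $W_s$ under $\sum_i C_i = C$. The function names and the arrival and work parameters are hypothetical.

```python
from math import sqrt


def optimal_capacities(lam, mu, total_capacity):
    """Capacities C_i minimizing W_s subject to sum(C_i) = total_capacity."""
    rho = sum(l / u for l, u in zip(lam, mu))
    if total_capacity <= rho:
        raise ValueError("total capacity must exceed rho for stability")
    s = sum(sqrt(l) / u for l, u in zip(lam, mu))
    return [l / u + (total_capacity - rho) * (sqrt(l) / u) / s
            for l, u in zip(lam, mu)]


def work_special(lam, mu, caps):
    """W_s = sum_i (1/mu_i) * rho_i / (1 - rho_i), rho_i = lam_i / (mu_i * C_i)."""
    total = 0.0
    for l, u, c in zip(lam, mu, caps):
        rho_i = l / (u * c)
        total += rho_i / (u * (1.0 - rho_i))
    return total


def work_general(lam, mu):
    """W_g = sum_i (lam_i / mu_i**2) / (1 - rho) for one general purpose processor."""
    rho = sum(l / u for l, u in zip(lam, mu))
    return sum(l / u ** 2 for l, u in zip(lam, mu)) / (1.0 - rho)


def capacity_threshold(lam, mu):
    """Smallest total capacity for which the optimized W_s does not exceed W_g."""
    rho = sum(l / u for l, u in zip(lam, mu))
    s = sum(sqrt(l) / u for l, u in zip(lam, mu))
    return (1.0 - rho) * s ** 2 / sum(l / u ** 2 for l, u in zip(lam, mu)) + rho


lam = [0.2, 0.1]   # hypothetical task arrival rates
mu = [1.0, 0.5]    # hypothetical work parameters
c_star = capacity_threshold(lam, mu)
caps = optimal_capacities(lam, mu, c_star)
print(c_star, caps)
print(work_special(lam, mu, caps), work_general(lam, mu))
```

At the threshold capacity the two printed amounts of work coincide; any larger total capacity makes the optimized special purpose system the better of the two under these assumptions.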


3.1.2.  Systems  with  arbitrary  number  of  processors 

To compare the effectiveness of systems containing an arbitrary number of
processors of each type, we consider two analytically tractable cases:

(i)  Deterministic task execution time
In particular, the amounts of work required for all tasks are constant
and identical. Without loss of generality, let the amount of work
required by a task be one unit of time or a time slot. If the average
interarrival time of jobs is sufficiently long compared with this time
slot, then we can approximate the exponential distribution of interarrival
time by a geometric distribution. We assume that the job scheduler
assigns tasks to the processors at the beginning of each time slot. If
$t_n$ is the $n$th epoch of this time slot, the total number of tasks at
$t_n + 0^+$ forms a Markov chain (and so does the total amount of remaining work
in the system at $t_n + 0^+$).

Let $\Pi_i$ be the probability of having $i$ tasks in a system containing
$m$ general purpose processors at equilibrium. It has been shown that when
the number of processors is larger than $\lambda E(N)$, the average number of
tasks in the system is given by [10]

H  s  Jo  ^  ~  * A'(1)  *  2(^%t  » 

where $A(z) = \prod_{i=1}^{r} A_i(z)$ is the generating function of the random variable
$N$ and $A_i(z)$ is the generating function of $N_i$. $z_0, z_1, \ldots, z_{m-2}$ are the
zeros of $(1 - z^m/A(z))$ within the unit circle.

In a multiprocessor system containing $r$ types of special purpose
processors, the execution time of a type $i$ task is $1/C_i$. Let $B_i(z)$ be the
generating function of the number of tasks that arrive during the execution of a
type $i$ task. Clearly,

$$B_i(z) = 1 - \frac{\lambda}{C_i} + \frac{\lambda}{C_i}\,A_i(z).$$

The average number of tasks in the $i$th subsystem, $n_i$, is given by
mi’2  ,  m.(m.-l)  3V(1) 

73jn*:.mri;.(l))  +  2(m%S!<D)  ' 


n,  =  i 
x  i=0 


where $z_j$ are zeros of $(1 - z^{m_i}/A_i(z))$ within the unit circle.
Unfortunately, it is difficult to obtain the general behaviors of
$W_g$ and $W_s$ except for the case where $A(z)$ and $A_i(z)$ are polynomials of
z and z 1, respectively.

(ii)  Tasks with exponential execution time in systems under heavy traffic

We assume here that the amount of work, $S_i$, required for type $i$ tasks
is exponentially distributed with parameter $\mu_i$. Let $S = S_1 + S_2 + \cdots + S_r$ be
the total amount of work of a job. Let us denote by $t_n$ the epoch when
the $n$th job arrives. Let $W_n$ be the amount of work remaining at $t_n$
in a system containing $m$ general purpose processors, $A_n$ be the amount of
work completed in the duration $[t_{n-1}, t_n]$, and $B_n$ be the amount of work
arriving at $t_n$. Then $W_{n+1}$ can be expressed as

$$W_{n+1} = \begin{cases} W_n + B_n - A_n & \text{if } W_n + B_n - A_n \ge 0 \\ 0 & \text{if } W_n + B_n - A_n < 0. \end{cases} \tag{5}$$

Under heavy traffic conditions, all $m$ processors are busy during
$[t_{n-1}, t_n]$. Therefore, $A_n$ has the same distribution as $mA$ where $A$ is the
interarrival time between two jobs. $B_n$ has the same distribution as $S$.
Hence we can write Equation (5) as

$$W_{n+1} = \begin{cases} W_n + (S - mA) & \text{if } W_n + S - mA \ge 0 \\ 0 & \text{if } W_n + S - mA < 0. \end{cases}$$

However, the expression in the right hand side of this equation is the
waiting time of the $(n+1)$st job in a system which consists of one processor
with the execution time of a job being $S$ and the interarrival time
between jobs being $mA$. Since

$$E[mA] = \frac{m}{\lambda}, \qquad E[S] = \sum_{i=1}^{r} \frac{1}{\mu_i}, \qquad E[S^2] = \Big(\sum_{i=1}^{r} \frac{1}{\mu_i}\Big)^2 + \sum_{i=1}^{r} \Big(\frac{1}{\mu_i}\Big)^2,$$

the mean waiting time in queue is

$$\frac{1}{2}\,\frac{\lambda}{m}\,E[S^2]\,\frac{1}{1-\rho},$$

where

$$\rho = \frac{\lambda}{m} \sum_{i=1}^{r} \frac{1}{\mu_i}.$$


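Moreover, the average total amount of work remaining, $W_g$, is

$$W_g = \frac{\lambda}{2m}\,\frac{1}{1-\rho}\,\Big[\Big(\sum_{i=1}^{r} \frac{1}{\mu_i}\Big)^2 + \sum_{i=1}^{r} \Big(\frac{1}{\mu_i}\Big)^2\Big]. \tag{6}$$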

Similarly, in the system containing $r$ types of special purpose
processors under heavy traffic conditions, the average total amount of work
remaining in the $i$th subsystem, $W_{s_i}$, is equal to the average waiting
time in queue in a single processor system with interarrival time of jobs
being $C_i m_i A$ and the execution time for a job being $S_i$. Since $S_i$ and
$C_i m_i A$ are exponentially distributed, we have

$$W_{s_i} = \frac{\rho_i}{1-\rho_i}\,\frac{1}{\mu_i},$$

where

$$\rho_i = \frac{\lambda}{C_i m_i}\,\frac{1}{\mu_i}.$$

Thus the average total amount of work remaining, $W_s$, is

$$W_s = \sum_{i=1}^{r} W_{s_i} = \lambda \sum_{i=1}^{r} \frac{1}{\mu_i\,(C_i m_i \mu_i - \lambda)}.$$

We can minimize the value of $W_s$ with respect to $C_i m_i$ under the
condition $Cm = \sum_{i=1}^{r} C_i m_i$. It can be shown [10] that the values of $C_i m_i$
that minimize $W_s$ are given by

$$C_i m_i = \frac{C \lambda}{\rho\,\mu_i}.$$

Furthermore, the minimum value of $W_s$ is

$$W_{s0} = \frac{m \rho^2}{\lambda}\,\frac{1}{C - \rho}.$$

Figure 3 shows the curves $W_g$ and $W_{s0}$ in the case of $r = 3$, $(1/\mu_i) = i+1$
for $i = 0, 1, 2$, $C_i = C$ and $m = 10$. Under heavily loaded conditions, the
"optimized" system with special purpose processors behaves better. Even in
the case $C = 1$ (i.e., no improvement of processor speed), the value of
$W_{s0}$ is almost comparable to that of $W_g$. This result is an expected one
since the flexibility in scheduling does not make much difference under
heavily loaded conditions.

Figure 4 describes another case where $r = m = 3$, $(1/\mu_i) = (0.9)^i$,
$C_i = C$ for $i = 0, 1, 2$. Therefore, $m_i = 1$ for all $i$. Since all subsystems
consist of a single processor, $W_s$ is not optimized in this case.
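The heavy traffic expressions can be evaluated directly. The sketch below assumes $W_g = \frac{\lambda}{2m}\frac{1}{1-\rho}\big[(\sum_i 1/\mu_i)^2 + \sum_i (1/\mu_i)^2\big]$ with $\rho = (\lambda/m)\sum_i 1/\mu_i$, $W_s = \lambda \sum_i 1/(\mu_i (C_i m_i \mu_i - \lambda))$, and the allocation $C_i m_i = C\lambda/(\rho \mu_i)$. The mean work values $1/\mu_i = 1, 2, 3$ and $m = 10$ follow the Figure 3 setting; the job arrival rate $\lambda = 1$ and the function names are assumptions of this example.

```python
def rho_general(lam, m, inv_mu):
    """rho = (lam / m) * sum_i 1/mu_i."""
    return (lam / m) * sum(inv_mu)


def work_general_heavy(lam, m, inv_mu):
    """Heavy traffic W_g for m general purpose processors."""
    rho = rho_general(lam, m, inv_mu)
    second_moment = sum(inv_mu) ** 2 + sum(x * x for x in inv_mu)
    return (lam / (2.0 * m)) * second_moment / (1.0 - rho)


def optimal_products(lam, m, inv_mu, c):
    """Products C_i * m_i that minimize W_s subject to sum(C_i m_i) = c * m."""
    rho = rho_general(lam, m, inv_mu)
    return [c * lam * x / rho for x in inv_mu]


def work_special_heavy(lam, inv_mu, products):
    """W_s = lam * sum_i 1 / (mu_i * (C_i m_i mu_i - lam)), written with x = 1/mu_i."""
    return sum(lam * x / (p / x - lam) for x, p in zip(inv_mu, products))


inv_mu = [1.0, 2.0, 3.0]   # 1/mu_i = i + 1 for i = 0, 1, 2
m, lam = 10, 1.0           # lam is a hypothetical job arrival rate
for c in (1.0, 1.5, 2.0):
    products = optimal_products(lam, m, inv_mu, c)
    print(c, work_special_heavy(lam, inv_mu, products),
          work_general_heavy(lam, m, inv_mu))
```

Sweeping `lam` toward the stability limit $m / \sum_i (1/\mu_i)$ traces out curves of the kind the figure plots.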

3.2.  Queueing  network  model 

There are two methods to schedule jobs in a multiprocessor system:
(i) multiprogramming and (ii) multitasking. By multiprogramming, one
usually means that the system may execute more than one job simultaneously.
By multitasking, one means that the system may execute tasks in a job
concurrently if the job can be divided into independently executable tasks.
Multiprocessor systems without multitasking can be modelled using queueing
networks. We model a multiprocessor system containing special purpose
processors approximately as an open network of Jackson servers. Let
$R_{ij}$ be the probability that a task of type $j$ is to be executed after the
completion of a task of type $i$. Let $f_i$ be the probability that the first task
to be performed for a job is of type $i$; then the arrival rates, $\lambda_i$, of tasks
of type $i$, $i = 1, 2, \ldots, r$, are given by the set of equations

$$\lambda_i = \lambda f_i + \sum_{j=1}^{r} \lambda_j R_{ji}.$$


According to Jackson's theorem, when $\lambda_i/(m_i C_i \mu_i) < 1$ for all $i$, the
equilibrium probability of finding $n_i$ tasks in subsystem $i$ $(i = 1, \ldots, r)$ is
given by

$$p_1(n_1)\,p_2(n_2) \cdots p_r(n_r),$$

where $p_i(n_i)$ is the equilibrium probability of finding $n_i$ tasks in an
M/M/$m_i$ queue with input rate $\lambda_i$ and average execution time $1/(\mu_i C_i)$. Hence,
for a given set of $\lambda_i$, our results described above are still valid
here.
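The traffic equations $\lambda_i = \lambda f_i + \sum_j \lambda_j R_{ji}$ form a linear system that can be solved in several ways. The sketch below uses plain fixed-point iteration, which is one possible choice rather than a method taken from the text; it assumes an open network in which every task eventually leaves, and the function name and the sample routing matrices are hypothetical.

```python
def task_arrival_rates(lam, first, routing, iterations=1000):
    """Solve lambda_i = lam * f_i + sum_j lambda_j * R[j][i] by iteration.

    lam:     job arrival rate.
    first:   first[i] is the probability that a job starts with a type i task.
    routing: routing[j][i] is the probability that a type i task follows
             the completion of a type j task.
    The iteration converges for open networks, where each row of `routing`
    leaves some probability of departing (directly or eventually).
    """
    r = len(first)
    rates = [lam * f for f in first]
    for _ in range(iterations):
        rates = [lam * first[i]
                 + sum(rates[j] * routing[j][i] for j in range(r))
                 for i in range(r)]
    return rates


# Pipeline of three task types: every job visits types 0, 1, 2 in order.
pipeline = [[0.0, 1.0, 0.0],
            [0.0, 0.0, 1.0],
            [0.0, 0.0, 0.0]]
print(task_arrival_rates(2.0, [1.0, 0.0, 0.0], pipeline))

# One task type that is repeated with probability 1/2 after each completion.
print(task_arrival_rates(2.0, [1.0], [[0.5]]))
```

Once the $\lambda_i$ are known, the stability condition can be checked per subsystem before any of the single-queue results are applied.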

A special case of interest is one where each job generates $r$ different
types of tasks that must be executed in fixed sequence as in a
pipelined system. In this case, a system consisting of special purpose
processors can be modelled by a series of $r$ M/M/$m_i$ queues. With the
average execution time of a job on a processor in the $i$th subsystem
being $1/(\mu_i C_i)$, the average amount of work remaining in the subsystem
containing $m_i$ type $i$ processors is

$$W_{s_i} = \frac{\rho_i}{1-\rho_i}\,\frac{1}{\mu_i},$$

where $\rho_i = \lambda/(C_i m_i \mu_i)$. Hence the average amount of work remaining for all
jobs in the $i$th subsystem is

$$n_i \sum_{j=i}^{r} \frac{1}{\mu_j},$$

where $n_i$ is the average number of jobs in the $i$th subsystem. Since
$n_i = \rho_i/(1-\rho_i)$, the average total amount of work remaining in the pipelined
multiprocessor system is

$$W_s = \sum_{i=1}^{r} \sum_{j=i}^{r} \frac{\lambda}{C_i m_i \mu_i - \lambda}\,\frac{1}{\mu_j}.$$
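A short Python sketch of the pipelined case follows. It assumes the double-sum form $W_s = \sum_{i=1}^{r} \sum_{j=i}^{r} \frac{\lambda}{C_i m_i \mu_i - \lambda}\,\frac{1}{\mu_j}$, in which a job waiting at stage $i$ still carries the work of stages $i$ through $r$; the inner summation range is an interpretation of the damaged source, and the function name and sample values are hypothetical.

```python
def pipeline_work(lam, caps, counts, mu):
    """Average total work remaining in a pipeline of r special purpose stages.

    lam:    job arrival rate.
    caps:   caps[i] is the capacity C_i of a type i processor.
    counts: counts[i] is the number m_i of type i processors.
    mu:     mu[i] is the work parameter of type i tasks (mean work 1/mu[i]).
    """
    r = len(mu)
    total = 0.0
    for i in range(r):
        service_rate = caps[i] * counts[i] * mu[i]
        if lam >= service_rate:
            raise ValueError(f"stage {i} is overloaded")
        jobs_in_stage = lam / (service_rate - lam)      # n_i = rho_i / (1 - rho_i)
        remaining_per_job = sum(1.0 / mu[j] for j in range(i, r))
        total += jobs_in_stage * remaining_per_job
    return total


# Three single-processor stages with 1/mu_i = 0.9**i and a common capacity C.
mu = [1.0 / 0.9 ** i for i in range(3)]
for c in (1.0, 1.5, 2.0):
    print(c, pipeline_work(0.5, [c] * 3, [1] * 3, mu))
```

Because early stages are weighted by all the work still ahead of a job, congestion near the front of the pipeline contributes more to this measure than congestion near the end.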

Note that in a system with $m$ general purpose processors, the processors
are not connected in pipeline. Hence the expression for $W_g$ in Equation (6)
is valid here also under heavy loading conditions.

Figure 5 describes the behavior of $W_g$ and $W_s$ in the case of $r = m = 3$,
$C_i = C$ and $(1/\mu_i) = (0.9)^i$, $i = 0, 1, 2$.
4.  Performance of multiprocessor systems with multitasking


Generally, jobs may contain independent and thus concurrently executable
tasks. Both turnaround time and system resource utilization
in a multiprocessor system may often be improved by allowing processors
to execute in parallel tasks identified either by programmers of the
jobs or by the compiler system to be concurrently executable. In this
section, we want to determine the potential performance improvement by
multitasking. Unfortunately, the models used in Sections II and III are
deficient in one way or another for this purpose. In the deterministic
model, the structures within an individual job are effectively described
in the general model of a task set. However, all jobs have the same structure
and, more restrictively, are assumed to arrive for service at the
same time. On the other hand, in the queueing models used in Section III,
the issue of task synchronization is completely ignored. The model described
in Figure 2 can be used only in the case where jobs consist of independent
tasks while queueing network models of multiprocessor systems can
be used only for systems without multitasking.


When the degree of concurrency is small (=2), the model in [11] can be
used. However, the case with degree of concurrency being two is not an
interesting one since it has been shown that multitasking is not a good way
to make effective use of multiprocessor system resources when the number
of processors is small. This result is due to the negative effect
of larger overhead. On the other hand, it is said that multitasking
is essential for a multiprocessor system containing a large number of
processors [12].


We study here the dependency of the potential improvement achievable by
multitasking on the degree of concurrency and the number of processors. For
this purpose, we discuss first the type of job structures considered here
and then a model of multiprocessor systems with multitasking.

4.1.  Job  structures 

Similar to our deterministic model, the structures of jobs can be
described by a probabilistic model represented graphically as shown in
Figure 6a. In this graph, a node represents a task, '{' represents the task
generation (e.g. by statements such as fork, cobegin, etc.), and '}'
represents the task synchronization (identified by statements such as
join, coend, etc.). A job consists of many stages of tasks. A stage
is either a set of tasks in the same column between a pair of '{' and
'}', or a task if it is not immediately preceded by '{'. A stage consists
of a random number of tasks, and service times of tasks in each
stage are stochastically independent. Tasks in the same stage can be
executed independently. Tasks in an inner stage (i.e., tasks in inner
brackets) are considered as a part of an outer stage. The execution of
tasks in the stage to the left begins before the tasks in stages to its
right. It is difficult to model the topological structure of the job,
i.e., relations between stages. Therefore, we consider here two approximate
models of the job structure: (1) the no synchronization model shown
in Figure 6b, and (2) the full synchronization model shown in Figure 6c.

In the no synchronization model, we assume that there is no task
synchronization. A task simply disappears after being served with a certain
probability, or it is followed by another stage. In the full synchronization
model, tasks may not generate new tasks; they must be synchronized immediately
after their completions. Thus there are no inner stages in a stage. The no
synchronization model preserves the topological structure of the job
to a certain degree, and can represent complex job structures. However, it
is difficult to keep track of a job; one cannot use it for the purpose
of evaluation of job turnaround time. The full synchronization model reduces
the topological structure into a simple linear structure and is easy to
analyse. It is the model of job structure used in the following.

4.2.  Queueing  model  of  multiprocessor  systems  with  multitasking 

We use open queueing networks such as the one shown in Figure 7 to model
multiprocessor systems with multitasking. The network has only one
external source and one sink. Each node in the network consists of a
number of processors and serves a stage of tasks in a job. These
processors may be of different types if the stage which the node serves
consists of several classes of tasks. After the arrival of a job at a
node, tasks in the stage are served concurrently whenever possible. The
service of a job in a node is completed when the services of all tasks in
the same stage are completed. The completed job proceeds to a next node
(or leaves the network), determined randomly with given probabilities.
Thus a job may be repeatedly served by a node. The number of tasks in a
stage may vary from node to node.

In general, it is impossible to find the probability distributions of the
number of jobs in each node in such a queueing network. Again, we
consider here only one special case.

4.3.  A  model  for  multitasking  systems  containing  identical  processors

Multiprocessor systems with identical general purpose processors can be
approximately modelled using the single node model in Figure 8. Again, we
assume that the job arrival is a Poisson process with the average arrival
rate $\lambda$. Each job consists of N independent tasks with identically
distributed service times. The distribution of N is arbitrary but has
finite first and second moments. N is referred to as the degree of
concurrency in the job. The service times of tasks are statistically
independent and exponentially distributed with mean $1/\mu$. The job
service time is the sum of task service times, i.e.,
$S = S_1 + S_2 + \cdots + S_N$. Thus the model of multiprocessor systems
is specified by a 4-tuple $(\lambda, \mu, m, N)$, where m is the number of
processors. It is a bulk arrival M/M/m queueing system.
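
As a minimal sketch of how the 4-tuple might be held in code, the block below stores the model parameters and derives two quantities that follow directly from the definitions: the mean job service time E(N)/mu and the ratio lambda E(N)/(m mu), which the text later identifies as the processor utilization. The class name `BulkModel`, its field names, and the sample values are illustrative assumptions and do not come from the report.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class BulkModel:
    """Hypothetical container for the 4-tuple (lambda, mu, m, N).

    N is represented by its distribution a[i] = Pr(N = i).
    """
    lam: float           # job arrival rate (lambda)
    mu: float            # task service rate (mean task time is 1/mu)
    m: int               # number of identical processors
    a: Dict[int, float]  # distribution of the degree of concurrency N

    def mean_n(self) -> float:
        # E(N) = sum_i i * a_i
        return sum(i * p for i, p in self.a.items())

    def mean_job_service_time(self) -> float:
        # S = S_1 + ... + S_N with E(S_i) = 1/mu and N independent of the S_i
        return self.mean_n() / self.mu

    def utilization(self) -> float:
        # lambda * E(N) / (m * mu); must be below 1 for a stationary regime
        return self.lam * self.mean_n() / (self.m * self.mu)


# Sample values chosen only for illustration.
model = BulkModel(lam=1.0, mu=1.0, m=4, a={1: 0.5, 3: 0.5})
print(model.mean_n(), model.mean_job_service_time(), model.utilization())
```

Keeping the distribution of N as a dictionary makes it easy to compute both the first and second moments that the later formulas require.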

Number  of  tasks  in  system 

Let $a_i$ be the probability that N is equal to i for i = 0, 1, 2, ...
($a_0 = 0$ and $a_i \ge 0$ for $i > 0$) and A(z) the generating function
of $\{a_i\}$. Let K be the number of tasks in the M/M/m bulk arrival
queueing system. K is a Markov chain. Its stationary distribution
$\{p_k\}$, if it exists, has the generating function P(z) given by [13]

$$P(z) = \frac{\mu (1-z) \sum_{k=0}^{m-1} (m-k)\, p_k z^k}{m\mu (1-z) - \lambda z \,(1 - A(z))}$$

Here $\mu_k = \min(k\mu, m\mu)$ denotes the service rate when k tasks are
in the system. Let $\rho = \lambda/\mu$ and $u = \lambda E(N)/m\mu$.
Because $p_k$ is a linear function of $p_0, p_1, \ldots, p_{k-1}$, we
write $p_k = f_k p_0$ for k < m. Hence

$$P(z) = \frac{m(1-u)(1-z) \sum_{k=0}^{m-1} (m-k) f_k z^k}{\left[ m(1-z) - \rho z \,(1 - A(z)) \right] \sum_{k=0}^{m-1} (m-k) f_k}, \qquad m < \infty.$$


The condition under which the stationary distribution exists is
$u = \lambda E(N)/m\mu < 1$. It is easy to see that u corresponds to the
utilization factor of the processors. For $m = \infty$,

$$P(z) = \exp\left\{ -\rho \int_z^1 \frac{1 - A(t)}{1 - t}\, dt \right\}.$$


Thus, the average number of tasks in the system is

$$E(K) = \frac{u \left( 1 + E(N^2)/E(N) \right)}{2(1-u)} + \frac{\sum_{k=1}^{m-1} (m-k)\, k f_k}{\sum_{k=0}^{m-1} (m-k) f_k} \qquad (8)$$

for $m < \infty$; for $m = \infty$, $E(K) = \rho E(N)$.
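
The following sketch shows one way the coefficients f_k and the mean number of tasks E(K) for finite m might be evaluated numerically. The recursion for f_k is derived from the usual balance equations of a bulk arrival M/M/m chain; those equations are not printed in the text, so their exact form here is an assumption, as are the function names. With N fixed at 1 the result should reduce to the ordinary M/M/m value, which the last lines check for m = 1 and m = 2.

```python
from typing import Dict, List


def f_coefficients(lam: float, mu: float, m: int, a: Dict[int, float]) -> List[float]:
    """Return [f_0, ..., f_{m-1}] such that p_k = f_k * p_0.

    Assumed balance equation for state k < m (not printed in the text):
        (lam + k*mu) p_k = (k+1)*mu p_{k+1} + lam * sum_{i=1..k} a_i p_{k-i}
    """
    f = [1.0]
    for k in range(m - 1):
        batch_inflow = sum(a.get(i, 0.0) * f[k - i] for i in range(1, k + 1))
        f.append(((lam + k * mu) * f[k] - lam * batch_inflow) / ((k + 1) * mu))
    return f


def mean_tasks(lam: float, mu: float, m: int, a: Dict[int, float]) -> float:
    """Average number of tasks in the system for finite m."""
    en = sum(i * p for i, p in a.items())       # E(N)
    en2 = sum(i * i * p for i, p in a.items())  # E(N^2)
    u = lam * en / (m * mu)                     # utilization factor
    if u >= 1.0:
        raise ValueError("no stationary distribution: utilization >= 1")
    f = f_coefficients(lam, mu, m, a)
    denom = sum((m - k) * f[k] for k in range(m))
    numer = sum((m - k) * k * f[k] for k in range(m))
    return u * (1.0 + en2 / en) / (2.0 * (1.0 - u)) + numer / denom


def p_zero(lam: float, mu: float, m: int, a: Dict[int, float]) -> float:
    """p_0 = m(1-u) / sum_k (m-k) f_k, from the normalisation P(1) = 1."""
    en = sum(i * p for i, p in a.items())
    u = lam * en / (m * mu)
    f = f_coefficients(lam, mu, m, a)
    return m * (1.0 - u) / sum((m - k) * f[k] for k in range(m))


# Sanity checks against ordinary M/M/m queues (N fixed at 1).
print(mean_tasks(0.5, 1.0, 1, {1: 1.0}))  # M/M/1 with rho = 0.5
print(mean_tasks(1.0, 1.0, 2, {1: 1.0}))  # M/M/2 with u = 0.5
```

Only the first m coefficients are needed, so the cost is quadratic in m regardless of how heavy the tail of N is.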


Task  waiting  time 

To find the task waiting time for the FCFS (First Come First Served)
discipline, we define the virtual task waiting time as the length of time
a task spends in the system (or in the queue) if it arrives at a random
instant. It can then be shown that the distribution of the task waiting
time is the same as the distribution of the virtual task waiting time
[13]. Thus we can get the task waiting time by calculating the virtual
task waiting time.


Let $W_j$ be the waiting time (in the queue) of the j-th task in a job
when we number tasks in the order in which they are served. If there are
k tasks in the system just before the job arrives,

$$E(W_j \mid K = k) = \begin{cases} 0, & k + j \le m \\ (k + j - m)/m\mu, & k + j > m \end{cases}$$

The above expression is obtained because all processors are busy while
the j-th task (in the arrived job) is waiting. Moreover, the average
inter-departure time is the mean of the minimum of the service times of
the m servers, i.e., $1/m\mu$, when all servers are busy. The
unconditional average is


$$E(W_j) = \frac{1}{m\mu} \left[ E(K) + (j - m) + \frac{m(1-u) \sum_{k=0}^{m-j} (m-j-k) f_k}{\sum_{k=0}^{m-1} (m-k) f_k} \right]$$
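
To make the conditioning argument concrete, the sketch below evaluates the conditional mean wait of the j-th task and then averages it over a supplied stationary distribution of K. It sums directly over the distribution rather than using the closed form, so it can serve as a cross-check. The function names are illustrative, and the geometric M/M/1 distribution used in the example is a standard textbook result brought in only for the check.

```python
from typing import Sequence


def conditional_wait(j: int, k: int, m: int, mu: float) -> float:
    """E(W_j | K = k): zero if a processor is free, else (k + j - m) / (m * mu)."""
    return max(0, k + j - m) / (m * mu)


def mean_wait(j: int, p: Sequence[float], m: int, mu: float) -> float:
    """Unconditional E(W_j), averaging over p[k] = Pr(K = k).

    The list p may be truncated; the caller is responsible for making the
    neglected tail small enough.
    """
    return sum(pk * conditional_wait(j, k, m, mu) for k, pk in enumerate(p))


# Check with an M/M/1 queue (m = 1, N fixed at 1): p_k = (1 - rho) * rho**k,
# for which the mean wait in queue is rho / ((1 - rho) * mu).
rho = 0.5
p_mm1 = [(1 - rho) * rho**k for k in range(200)]
w1 = mean_wait(1, p_mm1, m=1, mu=1.0)
print(w1)
```

For larger m the same function can be fed the distribution obtained from the generating function, at the cost of computing enough terms of p_k.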


Let W be the waiting time in the system of a task and $W_q$ be the
waiting time in the queue of a task. Then,

$$\Pr(W_q > t) = \sum_{i=1}^{\infty} a_i \sum_{j=1}^{i} \Pr(W_j > t) \,/\, E(N).$$

Hence,

$$E(W) = \sum_{i=1}^{\infty} a_i \sum_{j=1}^{i} E(W_j) \,/\, E(N) + 1/\mu.$$

The  job  waiting  time 


Let V be the waiting time in the system for a job, i.e., the time between
the arrival and the departure of a job. Here the departure of a job means
the departure of the last task in the job to leave the system. Let $K_r$
be the number of tasks in a job still remaining in the system when
service for the last task in the job begins. (Notice that the last task
to be served is not necessarily the last task to leave the system.) Let
$S_r$ be the time required to complete the service of all remaining tasks
in the job after the service of the last task begins. The conditional
average of $S_r$ is


$$E(S_r \mid K_r = n) = E(\max(S_1, S_2, \ldots, S_n)) = (1/\mu) \sum_{i=1}^{n} 1/i.$$
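
The harmonic sum above is easy to evaluate exactly. The sketch below computes the expected maximum of n independent exponential service times with rate mu; it relies on the memoryless property (with i tasks still in service, the time to the next completion has mean 1/(i mu)). The function names are illustrative.

```python
from fractions import Fraction


def harmonic(n: int) -> Fraction:
    """H_n = 1 + 1/2 + ... + 1/n, computed exactly."""
    return sum((Fraction(1, i) for i in range(1, n + 1)), Fraction(0))


def expected_remaining_time(n: int, mu: float) -> float:
    """E(S_r | K_r = n) = H_n / mu for n exponential tasks in service."""
    return float(harmonic(n)) / mu


for n in (1, 2, 3, 10):
    print(n, expected_remaining_time(n, mu=1.0))
```

Because H_n grows only logarithmically, adding more concurrent tasks lengthens the final phase of a job slowly, which is why the bounds on E(V) derived from this expression are fairly tight.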

Because $V = W_\ell + S_r$, where $W_\ell$ is the waiting time in the
queue of the last task,

$$E(V) = E(W_\ell) + (1/\mu) \sum_{n=1}^{m} \Pr(K_r = n) \sum_{i=1}^{n} 1/i.$$

It is possible to compute E(V) as done in [13]. However, the computation
of its numerical values is not efficient. The following bounds on E(V)
are more useful. Because $E(S_r \mid K_r = n)$ is an increasing function
of n and clearly $\Pr(K_r = n) = 0$ for n = 0 and n > min(N, m), we have
the lower bound

$$E(V) \ge E(W_\ell) + 1/\mu$$

and the upper bound

$$E(V) \le E(W_\ell) + (1/\mu) \sum_{j=1}^{\infty} a_j \sum_{k=1}^{\min(j,m)} 1/k.$$

Together, these bounds give us a good approximation of E(V). In order to
emphasize the fact that E(V) is a function of m, we write it as $E_m(V)$.
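
A short sketch of the two bounds follows. It takes the mean queueing time of the last task as an input rather than computing it, since that quantity depends on the rest of the model; the function name and the sample distribution of N are assumptions made for illustration.

```python
from typing import Dict, Tuple


def job_time_bounds(e_w_last: float, mu: float, m: int,
                    a: Dict[int, float]) -> Tuple[float, float]:
    """Return (lower, upper) bounds on E(V).

    e_w_last : mean waiting time in queue of the last task of a job
    a        : distribution of N, a[j] = Pr(N = j)
    """
    def harmonic(n: int) -> float:
        return sum(1.0 / k for k in range(1, n + 1))

    # Lower bound: at least the last task itself must still be served.
    lower = e_w_last + 1.0 / mu
    # Upper bound: at most min(N, m) tasks of the job are still in service.
    upper = e_w_last + sum(p * harmonic(min(j, m)) for j, p in a.items()) / mu
    return lower, upper


# With N fixed at 1 the two bounds coincide.
print(job_time_bounds(0.0, 1.0, 4, {1: 1.0}))
# With jobs of 1 or 3 tasks on m = 2 processors the gap stays modest.
print(job_time_bounds(0.0, 1.0, 2, {1: 0.5, 3: 0.5}))
```

The gap between the bounds is at most (H_min(N,m) - 1)/mu on average, so it widens only slowly with the degree of concurrency.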


4.4.  Performance  improvement  with  multitasking 

The queueing system may be adapted to model multiprocessor systems with
or without multitasking, as well as uniprocessor systems with the same
capacity as the multiprocessors (i.e., the uniprocessor is m times
faster), by adjusting the 4 parameters. To do so, we note first that the
performance of multiprocessor systems may be decreased by memory
interference and overhead. These effects can be taken into account by
assuming that the service time for a job increases in multiprocessor
systems. Let S be the service time for a job in uniprocessor systems. We
assume that aS and bS are the service times for a job in multiprocessor
systems with and without multitasking, respectively. The values of a and
b depend on m, N and $\lambda$. Generally, $a \ge b \ge 1$.

In a multiprocessor system with multitasking, a job departs only when all
tasks are completed. The 4-tuple which specifies the model for the system
is $(\lambda, \mu E(N)/a, m, N)$. In a multiprocessor system with
multiprogramming but not multitasking, all tasks in a job are executed in
sequence. Hence only one processor is assigned to a job, but many jobs may
be executed simultaneously. In this case, the 4-tuple is
$(\lambda, \mu/b, m, 1)$. Similarly, for a uniprocessor system the 4-tuple
is $(\lambda, m\mu, 1, 1)$ since the processor is m times faster. All
equations in this section are expressed in terms of $(\lambda, \mu, m, N)$,
whose parameters should be replaced by the appropriate values for each
system. Moreover, since $E_m(K)$ must be interpreted differently for
different systems (i.e., it is the average number of tasks for the system
with multitasking and the average number of jobs for the system without
multitasking), we choose again to use the total amount of work remaining,
$E_m(R)$, instead of $E_m(K)$ as a criterion of comparison. ($E_m(R)$ is
defined as the time required to complete all remaining tasks (or jobs) in
the system by a unit speed processor.) Note that $E_m(R)$ is independent
of the service disciplines (priority, etc.) of systems although $E_m(K)$
is not. $E_m(R)$ is obtained as follows: (i) for a multiprocessor system
with multitasking


$$E_m(R) = a\, E_m(K) \,/\, (\mu E(N))$$

where $E_m(K)$ is obtained from Equation (8) by replacing
$(\lambda, \mu, m, N)$ with $(\lambda, \mu E(N)/a, m, N)$; (ii) for a
multiprocessor system without multitasking

$$E_m(R) = b\, E_m(K) \,/\, \mu$$

where $E_m(K)$ is obtained from Equation (8) for the 4-tuple
$(\lambda, \mu/b, m, 1)$; (iii) for a uniprocessor system

$$E_m(R) = E_m(K) \,/\, \mu$$

where $E_m(K)$ is obtained from Equation (8) for the 4-tuple
$(\lambda, m\mu, 1, 1)$.
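
As a rough numerical illustration of cases (ii) and (iii), the sketch below compares the remaining work in a multiprocessor without multitasking against an m times faster uniprocessor. Both cases have N = 1, so the mean number in system is taken from the standard Erlang C result for an M/M/m queue instead of the bulk arrival formula; that substitution, the function names, and the sample parameters are assumptions made for this example.

```python
from math import factorial


def mmm_mean_in_system(lam: float, mu: float, m: int) -> float:
    """Mean number of jobs in a standard M/M/m queue (Erlang C formula)."""
    load = lam / mu  # offered load
    u = load / m     # utilization per server
    if u >= 1.0:
        raise ValueError("unstable system: utilization >= 1")
    below = sum(load**k / factorial(k) for k in range(m))
    last = load**m / factorial(m) / (1.0 - u)
    erlang_c = last / (below + last)  # probability that an arrival must wait
    return load + erlang_c * u / (1.0 - u)


def remaining_work_no_multitasking(lam: float, mu: float, m: int,
                                   b: float = 1.0) -> float:
    """Case (ii): 4-tuple (lam, mu/b, m, 1), remaining work b * E(K) / mu."""
    return b * mmm_mean_in_system(lam, mu / b, m) / mu


def remaining_work_uniprocessor(lam: float, mu: float, m: int) -> float:
    """Case (iii): 4-tuple (lam, m*mu, 1, 1), remaining work E(K) / mu."""
    return mmm_mean_in_system(lam, m * mu, 1) / mu


# Ideal case b = 1 with lam = 2, mu = 1, m = 4, i.e. utilization 0.5.
multi = remaining_work_no_multitasking(2.0, 1.0, 4)
uni = remaining_work_uniprocessor(2.0, 1.0, 4)
print(multi, uni)
```

In this example the uniprocessor carries less remaining work than the multiprocessor without multitasking, which agrees with the ordering the text reports from Figure 9.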

The values of parameters a and b can be determined based on the study of
memory interference and overhead. Here, we assume the ideal case, i.e.,
a = b = 1, in the following comparison. From Figure 9, we note that the
performance of the uniprocessor system is the best and the performance of
the multiprocessor system without multitasking is the worst, consistently
for all E(N). Their difference is larger for larger values of m. The
performance of the multiprocessor system with multitasking improves for
larger E(N). Indeed, the performance of the multiprocessor system with
multitasking is almost the same as the performance of the uniprocessor if
$E(N) \ge m$. The effect of the number of processors on the performance of
the systems is shown in Figure 10. The traffic intensity plotted is
proportional to the number of processors. $E_m(R)$ increases as m
increases although $E_m(R)/m$ decreases, and the rate of increase is
larger for smaller E(N). On the other hand, $E_m(V)$ decreases as m
increases and the rate of decrease is larger for larger E(N). If we assume
that traffic intensity is constant, both performance measures are improved
in all systems as m increases. However, the difference in performance
between these systems is smaller for larger u (system utilization factor).

It is our objective here to evaluate the merits of using special purpose
processors in multiprocessor systems. For this purpose, we propose
several models of multiprocessor systems and use them to obtain different
performance measures which may be used as criteria for comparison of the
processing capabilities of the two types of multiprocessor systems.

The general deterministic model of multiprocessor systems described in
Section II includes as special cases many models (e.g., models of systems
with identical processors, processors with memories of different sizes,
and different processors in job shop problems) used in previous studies.

We may conclude from the result in Section II that, according to a
priority driven schedule, the completion time of a set of tasks executed
on a system containing r types of processors can be very poor for large r.
The relatively inferior performance of multiprocessor systems containing
special purpose processors is clearly due to the loss in scheduling
flexibility in such systems. When such a system is used in real-time
environments, scheduling algorithms with better worst case behavior than
an arbitrary priority driven schedule need to be found.


To determine the minimum ratio between the speeds of special purpose
processors and general purpose processors to achieve the same overall
system capabilities, several approximate queueing models are proposed.
Using the total amount of remaining work in the system as a basis of
comparison, the two types of multiprocessor systems are compared
quantitatively for different speeds of the special purpose processors in
the case when the systems are multiprogrammed (but are not multitasked).

It is difficult to model multiprocessor systems with multitasking in
general. We discuss a probabilistic representation of job structures and
a general queueing network model for systems with multitasking. The model
may be used in simulation studies but is, unfortunately, analytically
intractable. We propose here to use an M/M/m queueing system with bulk
arrival as an approximate model of systems with multitasking. The
performance of multiprocessor systems with and without multitasking is
compared with that of a uniprocessor system with equal capability. Our
results confirm that multitasking improves the performance of
multiprocessor systems when the degree of concurrency in jobs is large
enough.